Use the commands below to download the data using the PATE functions, which store the output on GCS:
pate-rnd xmap-preprocessor --include_file_str Carryover_cross
pate-rnd xmap-collector --include_file_str Carryover \
--out_file_name mv_carryover_study.csv
Using the command-line tools, we copy the data from GCS to JH:
gsutil cp gs://inhouse-xmap-data/user_data_frames/mv_carryover_study.csv \
    /home/ddhillon/projects/beta-av-testing/data/raw/mv_carryover_study.csv
This page details the design.
2 plates with assay buffer only, and 2 additional plates with a checkerboard pattern
Note: The experiment was planned to first run the 2 blank plates and then the 2 checkerboard plates. However, it looks like this was not the case: the assay-buffer-only and checkerboard plates were run in alternating order.
We will only be able to use plate 1 to estimate what the distribution of the blanks looks like. But in a way, this lets us see that setting our acceptance criteria based off of one plate alone isn't appropriate - under the Estimate LoB tab, if we apply the metric set by plate 1 to plate 4, it fails the acceptance criteria.
In addition, we also find some row effects in the sample intensities. This reflects the fact that the xMAP reader moves from A1 to A24, and then back down to B1. This means the last row gets read at the very end, which is what we see here.
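The row effect can be checked directly by averaging MFI per plate row. A minimal sketch, assuming the data frame has a well ID column (here called `well`, with values like "A1") - the toy data below is a stand-in for the real `raw_data`:

```r
library(dplyr)

# Toy stand-in for the real raw_data (column names are assumptions)
raw_data <- data.frame(
  plate      = "_Plate1",
  assay      = "CEA",
  well       = paste0(rep(c("A", "B"), each = 4), 1:4),
  median_mfi = c(100, 102, 98, 101, 90, 92, 89, 91)
)

# Average MFI per plate row: lower means in later rows would indicate
# the read-order effect described above
row_means <- raw_data %>%
  mutate(row = substr(well, 1, 1)) %>%
  group_by(plate, assay, row) %>%
  summarise(mean_mfi = mean(median_mfi), .groups = "drop")
```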
Here we look at the overall distribution of the assay buffer wells in both the sample plates and the blank plate. The distributions of the MFIs look very similar across the two plates.
xMAP reads from A1 - A24 and then comes back and reads B1 - is the signal decreasing over rows?
Should we have one set of standards on the top 2 rows and another on the bottom 2 rows?
Ensure model training samples are randomized on the plate to avoid any plate effect confounds?
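One simple way to randomize sample placement is to shuffle the sample-to-well assignment before plating. A sketch for a 384-well (16 x 24) plate; the sample names are hypothetical:

```r
# Randomly assign 384 samples to wells of a 16 x 24 plate so that
# row/column position is not confounded with sample identity
set.seed(42)  # for a reproducible layout
wells   <- paste0(rep(LETTERS[1:16], each = 24), rep(1:24, times = 16))
samples <- sprintf("sample_%03d", 1:384)
layout  <- data.frame(well = wells, sample = sample(samples))
```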
Average MFI for assay buffer in each row
Average HC by row
From Plate 1, we will estimate the LoB using 2 methods:
1. 95th percentile
2. 3 SD
library(dplyr)

# LoB per assay from the plate 1 assay buffer wells:
# 95th percentile and mean + 3 SD of the median MFI
lob <- raw_data %>%
  filter(plate == "_Plate1") %>%
  filter(xponent_id == "Assay Buffer") %>%
  group_by(assay) %>%
  summarise(p_95 = quantile(median_mfi, 0.95),
            sd_3 = round(mean(median_mfi) + 3 * sd(median_mfi), 3)) %>%
  ungroup()
lob %>%
DT::datatable()
How many times do the assay buffer wells in the checkerboard plates exceed the p95?
How many times do the assay buffer wells exceed the 3SD LoB?
If we assume a normal distribution, this should be < 0.3 %.
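The exceedance rates can be computed by joining the plate 1 LoB estimates onto the checkerboard-plate assay buffer wells. A sketch using toy data in place of the real measurements (column names follow the LoB snippet above):

```r
library(dplyr)

set.seed(1)
# Toy stand-ins: blank-plate wells used for the LoB, and assay buffer
# wells from a checkerboard plate
blank   <- data.frame(assay = "CEA", median_mfi = rnorm(96, 50, 5))
checker <- data.frame(assay = "CEA", median_mfi = rnorm(48, 50, 5))

lob <- blank %>%
  group_by(assay) %>%
  summarise(p_95 = quantile(median_mfi, 0.95),
            sd_3 = mean(median_mfi) + 3 * sd(median_mfi))

# Percentage of checkerboard assay buffer wells above each threshold
exceed <- checker %>%
  left_join(lob, by = "assay") %>%
  group_by(assay) %>%
  summarise(pct_over_p95 = 100 * mean(median_mfi > p_95),
            pct_over_3sd = 100 * mean(median_mfi > sd_3))
```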
No wells in TNC exceed 3SD
< 1 % of the wells for CEA and WFDC2 exceed the 3SD limit
What if we applied these criteria to plate 4?
Even for a purely blank plate, we still find that for CEA and TNC, > 6 % of the wells exceed the p95, even though only 5% should.
For carryover, we will only look at Plate 2 and Plate 3 which are checkerboard pattern plates.
P95
When we check how many times assay buffer wells in the sample plates exceed the p95, it is > 5 %, whereas we would expect <= 5 % to exceed it. Since we only used plate 1 to estimate the p95, even when we apply this criterion to plate 4, we still see > 5 % failures.
So in our final study, we want to run 2-3 blank plates across different instruments to estimate this threshold, and then run our checkerboard plates.
3SD
When we do the same thing with 3SD LoB estimate, we see that
1. CEA exceeds 0.85 % of the time
2. TNC does not exceed at all
3. WFDC2 exceeds 0.28 % of the time.
Assuming a normal distribution, we would expect this to occur < 0.3 % of the time.
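The < 0.3 % figure is the two-sided normal tail probability beyond 3 SD; the one-sided probability of exceeding mean + 3 SD is about 0.135 %:

```r
# Normal tail probabilities beyond 3 SD
pnorm(-3)      # one-sided tail, ~0.00135 (~0.135 %)
2 * pnorm(-3)  # two-sided tail, ~0.0027 (~0.27 %)
```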
It is high risk to set our acceptance criteria using the LoB generated from the LoX studies. Instead, we could run multiple blank plates on the same instruments to estimate the 3SD of our assay buffer and then run the checkerboard plates.
Output
gsutil cp /home/ddhillon/projects/beta-av-testing/notebooks/carryover-crosscontamination/outputs/dev-study-01.html gs://freenome-user-data-ddhillon/outputs